Abstract: We consider a matrix-valued Gaussian sequence model, that is, we observe
a sequence of high-dimensional $M \times N$ matrices of heterogeneous Gaussian random
variables $x_{ij,k}$ for $i \in \{1, \dots, M\}$, $j \in \{1, \dots, N\}$ and $k \in \mathbb{Z}$. The standard deviation of
our observations is $\epsilon k^s$ for some $\epsilon > 0$ and $s \ge 0$.

We give sharp rates for the detection of a sparse submatrix of size $m \times n$ with active
components. A component $(i,j)$ is said to be active if the sequence $\{x_{ij,k}\}_k$ has mean
$\{\theta_{ij,k}\}_k$ within a Sobolev ellipsoid of smoothness $\tau > 0$ and total energy $\sum_k \theta_{ij,k}^2$
larger than some $r_\epsilon^2$. Our rates involve relationships between $m$, $n$, $M$ and $N$ tending
to infinity such that $m/M$, $n/N$ and $\epsilon$ tend to 0, under which a test procedure that we
construct has asymptotic minimax risk tending to 0.

We prove corresponding lower bounds under additional assumptions on the relative 
size of the submatrix in the large matrix of observations. Except for these additional 
conditions our rates are asymptotically sharp. Lower bounds for hypothesis testing 
problems mean that no test procedure can distinguish between the null hypothesis 
(no signal) and the alternative, i.e. the minimax risk for testing tends to 1. 

AMS 2000 subject classifications: 62H15, 60G15, 62G10, 62G20, 60C20. 
Keywords and phrases: Asymptotic minimax test, detection boundary, hetero- 
geneous observations, Gaussian white noise model, high-dimensional data, indirect 
observations, inverse problems, sharp rates, sparsity. 




imsart-generic ver. 2011/11/15 file: Detection-matrice-vl4.tex date: January 22, 2013 



C. Butucea and G. Gayraud/Sharp detection of smooth signals in a sparse matrix 2 
1. Introduction

Large matrices are used to model a growing number of applied problems in areas such
as signal theory, genomics and medical statistics. When large matrices of data are observed
over some period of time, we propose a procedure to test whether there is a smaller submatrix
made up entirely of active components, that is, of smooth signals with some given smoothness and
significant energy (measured by the $L_2$-norm). This test should be seen as a preliminary
step for dimension reduction.

This problem can be stated equivalently in the Gaussian sequence model of coefficients
(say Fourier coefficients) of the signals. We work with the Gaussian sequence
model, as it is easier for our computations, and discuss later on the alternative
interpretation as signal detection. We allow heterogeneous Gaussian observations in order to
cover the setup of indirect observations.

More precisely, we consider the following Gaussian sequence model 

\[ x_{ij,k} = \xi_{ij}\,\theta_{ij,k} + \epsilon\,\sigma_{ij,k}\,\eta_{ij,k}, \quad i \in I = \{1,\dots,M\},\ j \in J = \{1,\dots,N\},\ k \in \mathbb{Z}, \tag{1.1} \]

where $\{\eta_{ij,k}\}_{i \in I, j \in J, k \in \mathbb{Z}}$ is a sequence of independent standard Gaussian random variables,
$\sigma_{ij,k} > 0$ are known and $\epsilon > 0$ is the noise level. The $M \times N$ matrix $\xi = [\xi_{ij}]_{(i,j) \in I \times J}$ is
deterministic (unknown) and has elements in $\{0, 1\}$.

In what follows, the standard deviations $\sigma_{ij,k}$ are supposed to be the same for all
components of the matrix, that is, $\sigma_{ij,k} = \sigma_k$ for all $k$ does not depend on $(i,j)$ in $I \times J$. We
assume throughout the paper that, for some fixed given $s \ge 0$,

\[ \sigma_k \sim |k|^s, \quad \text{for large enough integer values of } |k|. \]

On the one hand, the case $s = 0$ reduces to the case of direct observations of the signal.
In that case, we could generalize our results to unknown (but constant) variance $\sigma$. On
the other hand, the case $s > 0$ corresponds to signals observed in inverse problems like
convolution with some independent noise, tomography etc.

The polynomial behaviour of $\sigma_k$ as $k$ grows to infinity corresponds to mildly ill-posed
inverse problems. We refer to [2] for more discussion on the relation between the sequence
model with increasing variance and inverse problems in the Gaussian white noise model.

The matrix-valued sequence $\theta = [\xi_{ij}\{\theta_{ij,k}\}_{k \in \mathbb{Z}}]_{(i,j) \in I \times J}$ is the quantity of interest. We
want to detect from observations in the model (1.1) whether there is only noise or whether
there are 'active components' in $\theta$, corresponding to where $\xi_{ij} = 1$. When a
component $(i,j)$ is active, we assume that the corresponding sequence $\{\theta_{ij,k}\}_k$ belongs to a
Sobolev ellipsoid and has significant total energy, i.e., $\{\theta_{ij,k}\}_k \in \Sigma(\tau, r_\epsilon)$, $\tau > 0$, $r_\epsilon > 0$,
where

\[ \Sigma(\tau, r_\epsilon) = \Big\{\theta \in l_2(\mathbb{Z}) : \sum_{k \in \mathbb{Z}} (2\pi k)^{2\tau}\theta_k^2 \le 1;\ \sum_{k \in \mathbb{Z}} \theta_k^2 \ge r_\epsilon^2 \Big\}. \tag{1.2} \]

In this paper, we assume that $\xi$ has a specific structure, i.e., it belongs to

\[ T_{M,N}(m,n) = \{\xi \text{ matrix of size } M \times N : \exists A_\xi \subset I,\ |A_\xi| = m \text{ and } \exists B_\xi \subset J,\ |B_\xi| = n \text{ such that } \xi_{ij} = \mathbf{1}((i,j) \in A_\xi \times B_\xi)\}, \]

where the non-null elements form a submatrix with $m$ rows and $n$ columns. We shall
always denote by $A_\xi$ and $B_\xi$ those rows and columns where the matrix $\xi \in T_{M,N}(m,n)$
has non-null elements.
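To make the sampling scheme concrete, the following minimal Python sketch draws observations from model (1.1) with a planted active submatrix. It is only an illustration under stated assumptions: the function name `simulate_model`, the choice $\sigma_k = \max(1, |k|)^s$, the restriction to finitely many indices $k$ and the use of one common signal for all active cells are choices made here, not part of the paper.

```python
import random


def simulate_model(M, N, rows, cols, theta, eps, s, seed=0):
    """Draw x[i][j][k] = xi_ij * theta_k + eps * sigma_k * eta_ijk as in (1.1).

    rows, cols : index sets A and B of the active submatrix (hypothetical inputs)
    theta      : dict {k: theta_k}; the same signal is used in every active cell
                 (an assumption made for this sketch)
    sigma_k    : max(1, |k|)**s, one concrete choice with sigma_k ~ |k|^s
    """
    rng = random.Random(seed)
    ks = sorted(theta)
    active_rows, active_cols = set(rows), set(cols)
    x = []
    for i in range(M):
        row = []
        for j in range(N):
            # xi_ij is the indicator of the active submatrix A x B.
            xi = 1 if (i in active_rows and j in active_cols) else 0
            cell = {}
            for k in ks:
                sigma_k = max(1, abs(k)) ** s
                cell[k] = xi * theta[k] + eps * sigma_k * rng.gauss(0.0, 1.0)
            row.append(cell)
        x.append(row)
    return x
```

With `eps = 0.0` the output reduces to the noiseless matrix $[\xi_{ij}\theta_k]$, which is a convenient sanity check of the indexing.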

The testing problem of interest is the following

\[ H_0 : \theta = 0 \]

\[ H_1(\tau, r_\epsilon) : \theta \in \Theta_{M,N}(\tau, r_\epsilon, m, n), \]

where, for $\tau, r_\epsilon > 0$ and for $m$, $n$, $M$ and $N$ large, such that $m < M$ and $n < N$, we define

\[ \Theta_{M,N}(\tau, r_\epsilon, m, n) = \{\theta = [\xi_{ij}\{\theta_{ij,k}\}_{k \in \mathbb{Z}}]_{(i,j) \in I \times J} : \xi \in T_{M,N}(m,n), \text{ and for all } (i,j) \in A_\xi \times B_\xi,\ \{\theta_{ij,k}\}_k \in \Sigma(\tau, r_\epsilon)\}. \]

The alternative hypothesis consists of matrices of size $M \times N$ containing mainly noise,
except for elements in some submatrix of size $m \times n$ containing sequences of Fourier
coefficients of signals with Sobolev smoothness $\tau$ and energy ($L_2$ norm) significantly large
(larger than $r_\epsilon$).

Remark 1.1 We may also assume that the matrix $\xi$ has entries either 0 or 1, such that
$\sum_{(i,j) \in I \times J} \xi_{ij} = mn$. That means that we know the number of non-null elements of the
matrix but they can be found anywhere in the matrix. This case is exactly the vector case
previously studied by [3] under the sparsity condition that the number of active components
$mn$ satisfies $mn = (MN)^{1-b}$, where $b \in (0, 1)$ corresponds to the sparsity index.

Denote by $P_0$ and $P_\theta$ the distributions under the null and the alternative, respectively.
Denote also by $E_0$, $\mathrm{Var}_0$ and $E_\theta$, $\mathrm{Var}_\theta$ the expected values and variances with respect
to $P_0$ and $P_\theta$, respectively. Set $\theta_{ij} = \{\theta_{ij,k}\}_{k \in \mathbb{Z}}$; indices of probabilities, expectations or
variances which are expressed in terms of the subsequences $\theta_{ij}$ (rather than the full
matrix-valued sequence) mean that they correspond to active components.

For any test procedure $\psi$, that is, any measurable function with respect to the
observations, taking values in $[0,1]$, set $\omega(\psi) = E_0(\psi)$ its type I error probability and
$\beta(\psi, \Theta_{M,N}(\tau, r_\epsilon, m, n)) = \sup_{\theta \in \Theta_{M,N}(\tau, r_\epsilon, m, n)} E_\theta(1 - \psi)$ its maximal type II error
probability over the set $\Theta_{M,N}(\tau, r_\epsilon, m, n)$. Let us denote by

\[ \gamma(\psi, \Theta_{M,N}(\tau, r_\epsilon, m, n)) = \omega(\psi) + \beta(\psi, \Theta_{M,N}(\tau, r_\epsilon, m, n)) \]

the total error probability of $\psi$ and denote by $\gamma$ the minimax total error probability over
$\Theta_{M,N}(\tau, r_\epsilon, m, n)$ which is defined by

\[ \gamma := \gamma(\Theta_{M,N}(\tau, r_\epsilon, m, n)) = \inf_\psi \gamma(\psi, \Theta_{M,N}(\tau, r_\epsilon, m, n)), \]

where the infimum is taken over all test procedures. We cannot distinguish $H_0$
and $H_1(\tau, r_\epsilon)$ if $\gamma \to 1$, and distinguishability occurs if there exists $\psi$ such that
$\gamma(\psi, \Theta_{M,N}(\tau, r_\epsilon, m, n)) \to 0$.

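The aim of this paper is to derive distinguishability conditions and separation rates
for alternatives $\Theta_{M,N}(\tau, r_\epsilon, m, n)$ and to determine statistical procedures $\psi$ (at least of
asymptotic $\alpha$-level) which achieve these separation rates. By separation rates, we mean a
family $\tilde r_\epsilon$ such that

\[ \gamma \to 1 \ \text{ if } \ \frac{r_\epsilon}{\tilde r_\epsilon} \to 0, \qquad \gamma(\psi, \Theta_{M,N}(\tau, r_\epsilon, m, n)) \to 0 \ \text{ if } \ \frac{r_\epsilon}{\tilde r_\epsilon} \to +\infty. \]

By sharp separation rates, we mean a family $\tilde r_\epsilon$ such that

\[ \gamma \to 1 \ \text{ if } \ \limsup \frac{r_\epsilon}{\tilde r_\epsilon} < 1, \qquad \gamma(\psi, \Theta_{M,N}(\tau, r_\epsilon, m, n)) \to 0 \ \text{ if } \ \liminf \frac{r_\epsilon}{\tilde r_\epsilon} > 1. \]

The asymptotics for model (1.1) are given by $\epsilon \to 0$ and, as we are mainly interested in
high-dimensional settings, by

\[ m,\ n,\ M \text{ and } N \to +\infty, \quad p = \frac{m}{M} \to 0, \quad q = \frac{n}{N} \to 0. \tag{1.3} \]

Here and later asymptotics and symbols $o$, $O$, $\sim$ and $\asymp$ are considered under $\epsilon \to 0$ and
$m$, $n$, $M$ and $N$ such that (1.3) holds.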

The plan of the paper is as follows. Section 2 explains how this model is related to 
the multivariate Gaussian white noise model and how the inverse problem reduces to 
heterogeneous observations in our Gaussian sequence model. In Section 3 we define the
test procedure and give sufficient conditions such that the minimax risk for testing tends 
to 0. The construction of our test procedure involves solving an optimization problem. 
Section 4 presents the lower bounds for our problem and proofs are given in Section 5 and 
the Appendix. 
2. Sparse high-dimensional signal detection

Let us see that the previous problem arises in some classical statistical models and hence
has a different interpretation. When dealing with high-dimensional data, functions of many
variables are often modelled with additive models. For many situations where additive
models are employed see Stone [8] and references therein. Let us consider the multivariate
Gaussian white noise model

\[ dX(t) = f(t)\,dt + \epsilon\,dW(t), \quad t \in [0,1]^d,\ d \in \mathbb{N}, \tag{2.1} \]

where $\epsilon > 0$ and $W(t)$ is the Wiener process. When estimating $f$ in a nonparametric model, the
curse of dimensionality makes the rates exponentially slow for large dimension $d$. Additive
models, where $f(t) = \sum_{j=1}^d f_j(t_j)$, $t_j \in [0,1]$ and $\int_0^1 f_j = 0$ for $j$ from 1 to $d$, are
estimated with much faster rates, but the global estimation risk still grows in a linear way
with $d$. It is assumed in [3] that the univariate signal functions $f_j$ belong to a class $\Sigma(\tau, r_\epsilon)$,
i.e., each has Sobolev smoothness $\tau$ and total energy $\|f_j\|_2$ larger than $r_\epsilon$. A function $f$ is
Sobolev smooth if it belongs to $L_2([0,1])$ and $\int |\hat f(u)|^2 (2\pi|u|)^{2\tau}\,du \le 1$ (where $\hat f$ is
the characteristic function of $f$), and $\tau$ is called its smoothness.

If we need to cope with very high dimension $d$, sparsity assumptions help reduce the
dimension. In Gayraud and Ingster [3], it was assumed that only $d^{1-b}$ coordinates, for some $0 < b < 1$,
are significantly active, i.e., $f(t) = \sum_{j=1}^d \xi_j f_j(t_j)$, $\xi_j \in \{0, 1\}$ for all $j$ from 1
to $d$, such that $\sum_j \xi_j = d^{1-b}$. They solved the following test problem:

$H_0$ : all $\xi_j = 0$ (no signal is detected in data)
$H_1(\tau, r_\epsilon)$ : there exist $d^{1-b}$ values of $j$ where $\xi_j = 1$ and $f_j \in \Sigma(\tau, r_\epsilon)$.

Different sharp detection rates were obtained along the values of $0 < b < 1$.
In our paper, we assume a sparse matrix structure for our additive model:

\[ f(t) = \sum_{i=1}^M \sum_{j=1}^N \xi_{ij} f_{ij}(t_{ij}), \quad t \in [0,1]^{M \times N} \text{ and } \xi \in T_{M,N}(m,n), \tag{2.2} \]

such that $\int f_{ij} = 0$ for all $i, j$. We call the component $(i,j)$ active if $\xi_{ij} = 1$ and, in that
case, we suppose that the signal in that coordinate belongs to the class $\Sigma(\tau, r_\epsilon)$.

Let us reduce the sparse additive model (2.1) such that (2.2) holds to our initial model.
Consider $\{\varphi_k\}_{k \in \mathbb{Z}}$ an orthonormal basis of $L_2[0,1]$ such that $\varphi_0 = 1$ (e.g., the Fourier
basis). Define the multivariate orthonormal family, for $t \in [0,1]^{M \times N}$,

\[ \varphi_{ij,k}(t) = \varphi_k(t_{ij}) \cdot \prod_{(l,h) \ne (i,j)} \varphi_0(t_{lh}) = \varphi_k(t_{ij}). \]

Then, project the signal in (2.1) on these functions:

\[ x_{ij,k} := \int_{[0,1]^{M \times N}} \varphi_{ij,k}(t) f(t)\,dt + \epsilon \int_{[0,1]^{M \times N}} \varphi_{ij,k}(t)\,dW(t) = \xi_{ij} \int_0^1 \varphi_k(t_{ij}) f_{ij}(t_{ij})\,dt_{ij} + \epsilon \cdot \eta_{ij,k}, \]

where $\{\eta_{ij,k}\}$ are i.i.d. standard Gaussian random variables. We get our initial model for
$\theta_{ij,k} = \int \varphi_k f_{ij}$ and $\sigma_k = 1$.

Therefore, our test problem can be written:

$H_0$ : all $\xi_{ij} = 0$ (no signal is detected in data)
$H_1(\tau, r_\epsilon)$ : there exists $\xi \in T_{M,N}(m,n)$ and for $\xi_{ij} = 1$ it holds that $f_{ij} \in \Sigma(\tau, r_\epsilon)$,

i.e., there exists a matrix $\xi$ in $T_{M,N}(m,n)$ such that the signal in active coordinates
has Sobolev smoothness $\tau$ and total energy larger than $r_\epsilon$.

The variances of our observations are allowed to increase, $\sigma_k \sim |k|^s$, $s \ge 0$. Indeed, let us
suppose that our additive model is observed as an inverse problem. That means that we
observe

\[ dX(t) = Kf(t)\,dt + \epsilon\,dW(t), \quad t = [t_{ij}]_{i,j} \in [0,1]^{M \times N}, \tag{2.3} \]

for some linear operator $K$, with $f$ given as in (2.2) and such that $\int K f_{ij} = 0$. In the
convolution model, for example, the signal is observed with an additive independent noise
having density $g$, then $Kf(y) = \int f(y-u)\,g(u)\,du$.

We suppose that $K^*K$ is a compact operator having eigenvalues $\sigma_k^{-2}$ decreasing
polynomially to 0 as $k$ tends to infinity. This corresponds to mildly ill-posed inverse problems,
whereas in the case of well-posed inverse problems the $\sigma_k^2 \le \sigma^2$ form a bounded sequence.

Then, we consider a singular value decomposition of $K$, that is, families of orthonormal
functions $\{\varphi_k\}_k$ and $\{\psi_k\}_k$ such that $K\varphi_k = \sigma_k^{-1}\psi_k$ and $K^*\psi_k = \sigma_k^{-1}\varphi_k$. Therefore, let
$\psi_0 = 1$ and $\psi_{ij,k}(t) = \psi_k(t_{ij})$, and project (2.3) on this family:

\[ y_{ij,k} := \sum_{l=1}^M \sum_{h=1}^N \xi_{lh} \int \psi_{ij,k}(t)\,K f_{lh}(t_{lh})\,dt + \epsilon \int \psi_{ij,k}(t)\,dW(t) = \xi_{ij} \int_0^1 \psi_k(u)\,K f_{ij}(u)\,du + \epsilon \cdot \eta_{ij,k}. \]

Note, moreover, that $\int_0^1 \psi_k \cdot K f_{ij} = \int_0^1 K^*\psi_k \cdot f_{ij} = \sigma_k^{-1} \int_0^1 \varphi_k \cdot f_{ij} = \sigma_k^{-1}\theta_{ij,k}$. Then, let
$x_{ij,k} = \sigma_k y_{ij,k}$ to get the model (1.1).

Note that Butucea and Ingster [1] studied the particular case where $\theta_{ij,k} = a \cdot \mathbf{1}(k = 0)$
and the variance of the noise is a given fixed $\sigma$. If we have in mind the Fourier basis,
it comes down to studying periodic signals. The asymptotic rates for testing were given
in terms of $n$, $m$, $N$ and $M$. Here, we replace the periodic signal with an arbitrary smooth
signal. Moreover, we add here the case of heterogeneous variables which include mildly
ill-posed inverse problems.



3. Testing procedures and their asymptotic behaviour 

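Consider the following family of weighted $\chi^2$-type statistics: for $(i,j)$ in $I \times J$,

\[ t_{ij,w} = \sum_{k \in \mathbb{Z}} w_k \left( \left( \frac{x_{ij,k}}{\epsilon \sigma_k} \right)^2 - 1 \right), \]

where $\{w_k\}_k$ is a sequence of weights such that $w_k \ge 0$ for all $k \in \mathbb{Z}$ and $\sum_{k \in \mathbb{Z}} w_k^2 = 1/2$.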

In order to define the weights $\{w_k^*\}_{k \in \mathbb{Z}}$ that will appear in the optimal test procedure,
we have to solve the following extremal problem. Recall that $\Sigma(\tau, r_\epsilon)$ denotes the Sobolev
ellipsoid defined in (1.2), with $\tau > 0$ and $r_\epsilon > 0$, and $\{\sigma_k\}_{k \in \mathbb{Z}}$ is a sequence of positive
real numbers. We define the sequences $\{w_k^*\}_{k \in \mathbb{Z}}$ and $\{\theta_k^*\}_{k \in \mathbb{Z}}$ as solutions to the following
optimization program:

\[ \sum_{k} w_k^* \Big(\frac{\theta_k^*}{\epsilon\sigma_k}\Big)^2 = \sup_{(w_k)_k \in l_2(\mathbb{Z}):\ w_k \ge 0;\ \sum_k w_k^2 = 1/2}\ \inf_{\theta \in \Sigma(\tau, r_\epsilon)}\ \sum_{k \in \mathbb{Z}} w_k \Big(\frac{\theta_k}{\epsilon\sigma_k}\Big)^2. \tag{3.1} \]

Let us denote by $V_\epsilon := V(r_\epsilon) = \sum_{k \in \mathbb{Z}} w_k^* (\theta_k^*/\sigma_k)^2$, so that $a(r_\epsilon) := V_\epsilon/\epsilon^2$ is the value of
the optimization problem (3.1) at the optimal point.

Let us discuss heuristically why we need to solve this problem, before giving the solution.
Note that under the null hypothesis our statistic becomes $t_{ij,w} = \sum_{k \in \mathbb{Z}} w_k(\eta_{ij,k}^2 - 1)$ and it
is centered and reduced (due to the normalization $\sum_{k \in \mathbb{Z}} w_k^2 = 1/2$). Under the alternative,

\[ E_{\theta_{ij}}(t_{ij,w}) = \sum_{k} w_k \Big(\frac{\theta_{ij,k}}{\epsilon\sigma_k}\Big)^2. \tag{3.2} \]

In order to best distinguish the alternative from the null, we need to consider the worst
parameter $\theta_{ij}$ under the alternative and then maximize over possible weights $w_k \ge 0$
verifying the normalization constraint $\sum_k w_k^2 = 1/2$.
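The role of the normalization can be checked numerically. The sketch below is an illustration only: it assumes the statistic has the form $t_{ij,w} = \sum_k w_k((x_{ij,k}/(\epsilon\sigma_k))^2 - 1)$, which is the form consistent with the null expression $\sum_k w_k(\eta_{ij,k}^2 - 1)$ and with the mean (3.2), and it uses arbitrary weights rather than the optimal ones. The function names are hypothetical.

```python
import math


def normalize_weights(raw):
    """Rescale nonnegative weights so that sum_k w_k^2 = 1/2."""
    norm = math.sqrt(sum(w * w for w in raw))
    return [w / (math.sqrt(2.0) * norm) for w in raw]


def chi2_statistic(x, weights, eps, sigmas):
    """t = sum_k w_k * ((x_k / (eps * sigma_k))**2 - 1) for one cell (i, j)."""
    return sum(w * ((xk / (eps * sk)) ** 2 - 1.0)
               for xk, w, sk in zip(x, weights, sigmas))


def mean_under_alternative(theta, weights, eps, sigmas):
    """Right-hand side of (3.2): sum_k w_k * (theta_k / (eps * sigma_k))**2."""
    return sum(w * (tk / (eps * sk)) ** 2
               for tk, w, sk in zip(theta, weights, sigmas))
```

Since $\mathrm{Var}(\eta^2) = 2$ for a standard Gaussian $\eta$, the constraint $\sum_k w_k^2 = 1/2$ is exactly what makes the null variance $2\sum_k w_k^2$ equal to 1.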

Proposition 3.1 Let $\{\sigma_k\}_{k \in \mathbb{Z}}$ be a sequence of positive real numbers such that $\sigma_k \sim |k|^s$
for $|k|$ large enough, for a given $s \ge 0$. Then, the optimization problem (3.1) has the
following solution:



-|^, is such that maxtt;^ < r^^'^'^^ 0; 



~ c(r, s)re 2" , a{r^) c{t, s)e , 
where the asymptotics are taken as $k \to \infty$ and as $r_\epsilon \to 0$, with

'^^''^ " ^K2^ Kf''^" (4. + l)(4. + 2r + l)' 

4V2T(27r)2^ ,12 1 

Ko = r-, r and K-i = 1 , 

(4s + 2T + l)(4s + 4r + l) 4s + 1 4s + 2r + 1 4s + 4r + 1 ' 

and where $(x)_+ = \max(0, x)$.

The proof of Proposition 3.1 is postponed to the Appendix. Note that $\{w_k^*\}_k$ and $\{\theta_k^*\}_k$
satisfy the constraints in (3.1), that is, $\sum_k (w_k^*)^2 = 1/2$, $\sum_k (\theta_k^*)^2 = r_\epsilon^2(1 + o(1))$ and
$\sum_k (2\pi k)^{2\tau}(\theta_k^*)^2 = 1 + o(1)$, as $r_\epsilon \to 0$. It is worthwhile to note that due to
Proposition 3.1 and relation (3.2), we have

- "^(•'.) (3.3) 

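\[ \inf_{\theta_{ij} \in \Sigma(\tau, r_\epsilon)} E_{\theta_{ij}}(t_{ij,w^*}) = a(r_\epsilon) \tag{3.4} \]

and note also that the sequences $\{w_k^*\}_k$ and $\{\theta_k^*\}_k$ have a finite number $T$ of non-null
elements, but $T$ grows to infinity as $r_\epsilon \to 0$.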

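Define the test procedures,

\[ \psi^{\chi^2} = \mathbf{1}(t^{\chi^2} > H), \quad \text{with } t^{\chi^2} = \frac{1}{\sqrt{MN}} \sum_{(i,j) \in I \times J} t_{ij,w^*}, \tag{3.5} \]

\[ \psi^{scan} = \mathbf{1}(t^{scan} > K), \quad \text{with } t^{scan} = \max_{\xi \in T_{M,N}(m,n)} \frac{1}{\sqrt{mn}} \sum_{(i,j) \in A_\xi \times B_\xi} t_{ij,w^*}, \tag{3.6} \]

where $H$ and $K$ are positive and $w^* = \{w_k^*\}_{k \in \mathbb{Z}}$ is the sequence of weights which solves
the optimization problem (3.1).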
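As an illustration of (3.5) and (3.6), the following Python sketch evaluates both statistics on a small matrix `t` of already computed cell statistics $t_{ij,w^*}$. It is a sketch under assumptions, not the authors' implementation: the function names are hypothetical, and the scan is an exhaustive search that is only usable for tiny $M$, $N$. It uses the fact that, once the $n$ columns are fixed, the best $m$ rows are those with the largest row sums, so only column subsets need to be enumerated.

```python
import itertools
import math


def linear_statistic(t):
    """(3.5): sum of all cell statistics divided by sqrt(M * N)."""
    M, N = len(t), len(t[0])
    return sum(sum(row) for row in t) / math.sqrt(M * N)


def scan_statistic(t, m, n):
    """(3.6) by exhaustive search over column subsets (tiny inputs only)."""
    M, N = len(t), len(t[0])
    best = -math.inf
    for cols in itertools.combinations(range(N), n):
        # For fixed columns, the best m rows are those with the largest row sums.
        row_sums = sorted((sum(t[i][j] for j in cols) for i in range(M)),
                          reverse=True)
        best = max(best, sum(row_sums[:m]))
    return best / math.sqrt(m * n)


def decide(t, m, n, H, K):
    """Reject H0 (return 1) if either statistic exceeds its threshold."""
    return int(linear_statistic(t) > H or scan_statistic(t, m, n) > K)
```

The exhaustive version still enumerates all $\binom{N}{n}$ column subsets, which illustrates why the scan test is costly for large matrices.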

The following theorem gives the upper bounds for the testing rates of the previously
defined procedures.

Theorem 3.1 Assume (1.3). Suppose that $r_\epsilon \to 0$ and recall that

\[ a(r_\epsilon) \sim c(\tau, s)\,\epsilon^{-2}\, r_\epsilon^{2 + (4s+1)/(2\tau)}. \]

1. The linear test statistic $\psi^{\chi^2}$ defined by (3.5) has the following properties.

Type I error probability: if $H \to \infty$, then $\omega(\psi^{\chi^2}) = o(1)$.
Type II error probability: if

\[ a^2(r_\epsilon)\,mnpq \to +\infty, \tag{3.7} \]

choose $H$ such that $H \le c \cdot a(r_\epsilon)\sqrt{mnpq}$, for some $0 < c < 1$, then

\[ \beta(\psi^{\chi^2}, \Theta_{M,N}(\tau, r_\epsilon, m, n)) = o(1). \]

2. The scan test statistic $\psi^{scan}$ defined by (3.6) has the following properties.

Take $K^2 = 2(1 + \delta)(m \cdot \log(p^{-1}) + n \cdot \log(q^{-1}))$, for some small $\delta > 0$, and suppose
moreover that $K^2 r_\epsilon^{1/\tau}/(mn)$ tends to 0 asymptotically.

Type I error: we have $\omega(\psi^{scan}) = o(1)$.

Type II error: if

\[ \liminf \frac{a^2(r_\epsilon)\,mn}{2(m \cdot \log(p^{-1}) + n \cdot \log(q^{-1}))} > 1, \tag{3.8} \]

then $\beta(\psi^{scan}, \Theta_{M,N}(\tau, r_\epsilon, m, n)) = o(1)$.

Consider the test procedure which combines $\psi^{\chi^2}$ and $\psi^{scan}$ as follows:

\[ \psi^* = \max(\psi^{\chi^2}, \psi^{scan}). \]

As a consequence of Theorem 3.1, the test procedure $\psi^*$ with $H$ and $K$ properly chosen is
such that $\gamma(\psi^*, \Theta_{M,N}(\tau, r_\epsilon, m, n)) = o(1)$ as soon as either (3.7) or (3.8) holds.

The procedure $\psi^{\chi^2}$ is rather simple to implement. However, there are difficulties in
implementing the scan procedure. Indeed, computing the scan statistic $t^{scan}$ requires computing
standardized sums over all submatrices of size $m \times n$ in the large $M \times N$ matrix. This
is computationally infeasible for large values of $M$, $N$, $m$ and $n$. However, a heuristic
algorithm can be implemented as in [1], following [6] and [7], which is a randomized procedure
finding local maxima. With a sufficiently large number of random initial values in the
algorithm, there is a large probability that the algorithm actually finds the global maximum
that we aim at.
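One common way to search for local maxima of a submatrix sum is alternating maximization with random restarts; the sketch below illustrates that idea. It is not claimed to be the procedure of [1], [6] or [7]: the function name, the restart count and the stopping rule are assumptions made for this illustration. Starting from random columns, it alternately picks the $m$ rows with the largest sums over the current columns and the $n$ columns with the largest sums over the current rows, until the total stops increasing.

```python
import math
import random


def local_search_scan(t, m, n, restarts=20, seed=0):
    """Heuristic lower bound on the scan statistic via alternating maximization."""
    M, N = len(t), len(t[0])
    rng = random.Random(seed)
    best = -math.inf
    for _ in range(restarts):
        cols = rng.sample(range(N), n)  # random initial set of n columns
        total = -math.inf
        while True:
            # Best m rows for the current columns.
            rows = sorted(range(M), key=lambda i: sum(t[i][j] for j in cols),
                          reverse=True)[:m]
            # Best n columns for the current rows.
            cols = sorted(range(N), key=lambda j: sum(t[i][j] for i in rows),
                          reverse=True)[:n]
            new_total = sum(t[i][j] for i in rows for j in cols)
            if new_total <= total:
                break  # no strict improvement: a local maximum is reached
            total = new_total
        best = max(best, total)
    return best / math.sqrt(m * n)
```

Each restart returns a local maximum only, so the output is in general a lower bound on $t^{scan}$; more restarts raise the chance of reaching the global maximum at a proportional cost.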

4. Optimality of the detection boundaries 

We prove here optimality results for the rates that the previous test procedure $\psi^*$ attains.
However, optimality is obtained under an additional hypothesis requiring an 'almost'
square matrix, in the sense that the relative sizes of the submatrix should be of the same
order in both directions (row and column sizes).
Theorem 4.1 Assume (1.3) and that

\[ \frac{\log\log(p^{-1})}{\log(q^{-1})} \to 0, \qquad \frac{\log\log(q^{-1})}{\log(p^{-1})} \to 0. \tag{4.1} \]

Assume, moreover, that

\[ m \cdot \log(p^{-1}) \asymp n \cdot \log(q^{-1}), \tag{4.2} \]

\[ \frac{m \cdot \log(p^{-1}) + n \cdot \log(q^{-1})}{mn} = o(1)\,\epsilon^{-\frac{2}{2\tau + 2s + 1}}. \tag{4.3} \]

If $r_\epsilon$ is such that the following conditions are satisfied

\[ a^2(r_\epsilon) \cdot mnpq \to 0, \tag{4.4} \]

\[ \limsup \frac{a^2(r_\epsilon) \cdot mn}{2(m \cdot \log(p^{-1}) + n \cdot \log(q^{-1}))} < 1, \tag{4.5} \]

then $\inf_\psi \gamma(\psi, \Theta_{M,N}(\tau, r_\epsilon, m, n)) \to 1$.

The proof of Theorem 4.1 is given in Section 5.2. It follows closely the proof in [1] with
important differences due to the non-Gaussian likelihoods in this setup.

4.1. Optimal detection boundary

Theorems 3.1 and 4.1 together say that, under assumptions (1.3), (4.1), (4.2) and (4.3), a
detection boundary is defined via the relations

\[ a^2(\tilde r_\epsilon) \cdot mnpq \asymp 1, \qquad a^2(\tilde r_\epsilon) \cdot mn \sim 2(m \cdot \log(p^{-1}) + n \cdot \log(q^{-1})). \]

Therefore, the detection boundary can be written

\[ a(\tilde r_\epsilon) \asymp \min\left\{ \frac{1}{\sqrt{mnpq}},\ \sqrt{\frac{2(m \cdot \log(p^{-1}) + n \cdot \log(q^{-1}))}{mn}} \right\}, \]

with the constant equal to 1 if $\frac{2(m \cdot \log(p^{-1}) + n \cdot \log(q^{-1}))}{mn} < \frac{1}{mnpq}$. By Proposition 3.1,
we have that $a(\tilde r_\epsilon) \sim c(\tau, s)\,\epsilon^{-2}(\tilde r_\epsilon)^{2 + (4s+1)/(2\tau)}$ as $\epsilon \to 0$ and $\tilde r_\epsilon \to 0$. It implies that
$\epsilon^2 a(\tilde r_\epsilon) \to 0$, giving furthermore that one of the following conditions holds:

\[ \epsilon^{-2}\sqrt{mnpq} \to \infty \quad \text{or} \quad \sqrt{2(m \cdot \log(p^{-1}) + n \cdot \log(q^{-1}))} = o(\epsilon^{-2}\sqrt{mn}). \]

The minimax error of the scan test tends to 0 if $K^2(\tilde r_\epsilon)^{1/\tau}/(mn) \to 0$, and a sufficient
condition for that is condition (4.3).

We can recover from these results the rates for one-dimensional sequences (i.e. $M =
N = m = n = 1$). In this case it is required that $a(\tilde r_\epsilon)$ is asymptotically constant, which
means $\tilde r_\epsilon \asymp \epsilon^{4\tau/(4\tau + 4s + 1)}$, that is, the minimax rate for testing a one-dimensional signal
with Sobolev smoothness $\tau$.
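The boundary can be evaluated numerically. The sketch below is an illustration only: it computes the minimum of the two levels $1/\sqrt{mnpq}$ and $\sqrt{2(m\log(p^{-1}) + n\log(q^{-1}))/(mn)}$, and inverts the asymptotic relation $a(r) \sim c(\tau, s)\,\epsilon^{-2} r^{2 + (4s+1)/(2\tau)}$ while ignoring the constant $c(\tau, s)$, so the returned radius is correct only up to a constant factor. The function names are hypothetical.

```python
import math


def detection_boundary(M, N, m, n):
    """min{1/sqrt(mnpq), sqrt(2(m log(1/p) + n log(1/q)) / (mn))}, p=m/M, q=n/N."""
    p, q = m / M, n / N
    linear_level = 1.0 / math.sqrt(m * n * p * q)
    scan_level = math.sqrt(
        2.0 * (m * math.log(1.0 / p) + n * math.log(1.0 / q)) / (m * n))
    return min(linear_level, scan_level)


def separation_radius(a, eps, tau, s):
    """Invert a = eps^-2 * r^(2 + (4s+1)/(2 tau)); the constant c(tau, s) is dropped."""
    return (eps ** 2 * a) ** (2.0 * tau / (4.0 * tau + 4.0 * s + 1.0))
```

For example, with $a = 1$, $s = 0$ and $\tau = 1$ the second function returns $\epsilon^{4/5}$, in line with the one-dimensional rate $\epsilon^{4\tau/(4\tau + 4s + 1)}$.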
4.2. Universal test procedure

It is clear that our test procedure introduced in the previous part depends on $r_\epsilon$. One
may be interested in dealing with a universal test procedure, i.e., a unique statistical
procedure for a range of $r_\epsilon$ and for which it is possible to derive the upper bounds as in
Theorem 3.1. In this section, we describe such a procedure.

Recall that $\tilde r_\epsilon$ can be obtained from $a(\tilde r_\epsilon)$ using Proposition 3.1. This gives $\tilde r_\epsilon \asymp
(\epsilon^2 a(\tilde r_\epsilon))^{2\tau/(4\tau + 4s + 1)}$. Moreover, the optimisation problem (3.1) gives associated optimal
weights $w_k^* = w_k^*(\tilde r_\epsilon)$. Let us consider $\tilde r_\epsilon^{\chi^2}$ obtained from $a^2(\tilde r_\epsilon^{\chi^2}) \asymp (mnpq)^{-1}$ and the
associated test procedure $\psi^{\chi^2}$ defined in (3.5) for the weights $w_k^*(\tilde r_\epsilon^{\chi^2})$. Similarly, put $\tilde r_\epsilon^{scan}$
obtained from $a^2(\tilde r_\epsilon^{scan}) \sim 2(m \log(p^{-1}) + n \log(q^{-1}))/(mn)$ and $\psi^{scan}$ as in (3.6) with the
weights $w_k^*(\tilde r_\epsilon^{scan})$.

The type II errors of $\psi^{\chi^2}$ and $\psi^{scan}$ are stated for $\lim a(r_\epsilon)/a(\tilde r_\epsilon) \to +\infty$ and
$\liminf a(r_\epsilon)/a(\tilde r_\epsilon) > 1$, respectively, which are equivalent, due to Proposition 3.1, to
$\lim r_\epsilon/\tilde r_\epsilon \to +\infty$ and $\liminf r_\epsilon/\tilde r_\epsilon > 1$. Due to Proposition 4.1 in [3], one gets that for any
$\theta_{ij} \in \Sigma(\tau, B\tilde r_\epsilon)$ with $B \ge 1$,

\[ E_{\theta_{ij}}(t_{ij,w^*(\tilde r_\epsilon)}) \ge B^2 a(\tilde r_\epsilon). \tag{4.6} \]

Taking $H$ and $K$ as in Theorem 3.1 and using (4.6), we derive for $\psi^{\chi^2}$ and
$\psi^{scan}$ the same upper bounds as in Theorem 3.1. For $\psi^{scan}$, note that the condition
$(K/\sqrt{nm}) \max_k w_k^*(\tilde r_\epsilon^{scan}) = o(1)$ is satisfied as soon as $(m \log(p^{-1}) + n \log(q^{-1}))/(nm)$ =

5. Proofs 

Let us start with a preliminary result that gives an approximation of the moment
generating function of $t_{ij,w^*}$ under $H_0$.

Lemma 5.1 For any real number $\lambda$ such that $\lambda \max_k w_k^* = o(1)$,

\[ E_0(\exp(\lambda t_{ij,w^*})) = \exp\Big(\frac{\lambda^2}{2}(1 + o(1))\Big). \]
The proof of Lemma 5.1 is postponed to the appendix.
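The approximation in Lemma 5.1 can be checked numerically in a simple case. The sketch below is an illustration under assumptions: it takes the null form $t = \sum_k w_k(\eta_k^2 - 1)$ with i.i.d. standard Gaussian $\eta_k$, uses flat weights (not the optimal $w_k^*$) normalized so that $\sum_k w_k^2 = 1/2$, and relies on the standard identity $E\exp(a(\eta^2 - 1)) = e^{-a}(1 - 2a)^{-1/2}$ for $a < 1/2$. The function name is hypothetical.

```python
import math


def log_mgf_null(lam, weights):
    """Exact log E_0 exp(lam * t) for t = sum_k w_k (eta_k^2 - 1), eta_k i.i.d. N(0,1).

    Uses E exp(a (eta^2 - 1)) = exp(-a) / sqrt(1 - 2a), valid for a < 1/2.
    """
    total = 0.0
    for w in weights:
        a = lam * w
        if a >= 0.5:
            raise ValueError("lam * w_k must stay below 1/2")
        total += -a - 0.5 * math.log1p(-2.0 * a)
    return total


T = 10_000
weights = [1.0 / math.sqrt(2.0 * T)] * T  # flat weights with sum w_k^2 = 1/2
# Ratio of the exact log-mgf to the Gaussian approximation lam^2 / 2 at lam = 1.
ratio = log_mgf_null(1.0, weights) / (1.0 ** 2 / 2.0)
```

Each term expands as $(\lambda w_k)^2 + \tfrac{4}{3}(\lambda w_k)^3 + \dots$, so the ratio is slightly above 1 and approaches 1 as $\lambda \max_k w_k \to 0$.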

5.1. Proof of Theorem 3.1 

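Observe that under $H_0$, the $t_{ij,w}$ are i.i.d. random variables with zero mean and unit variance.
Indeed, one gets $\mathrm{Var}_0(t_{ij,w}) = \sum_k w_k^2 \mathrm{Var}(\eta_{ij,k}^2) = 2\sum_k w_k^2 = 1$. Under the alternative,
for all $\theta_{ij} \in \Sigma(\tau, r_\epsilon)$,

\[ E_{\theta_{ij}}(t_{ij,w}) = \sum_k w_k \frac{\theta_{ij,k}^2}{\epsilon^2\sigma_k^2}, \qquad \mathrm{Var}_{\theta_{ij}}(t_{ij,w}) = 1 + 4\sum_k w_k^2 \frac{\theta_{ij,k}^2}{\epsilon^2\sigma_k^2}. \]

Due to the previous relations, for any $\theta \in \Theta_{M,N}(\tau, r_\epsilon, m, n)$,

\[ E_\theta(t^{\chi^2}) = \frac{1}{\sqrt{MN}} \sum_{(i,j) \in A_\xi \times B_\xi} E_{\theta_{ij}}(t_{ij,w^*}) \ge \sqrt{MN}\,pq \cdot a(r_\epsilon) = \sqrt{mnpq} \cdot a(r_\epsilon), \tag{5.1} \]
where the penultimate inequality follows from (3.4). Moreover, for the variance we have

\[ \mathrm{Var}_\theta(t^{\chi^2}) = \frac{1}{MN} \sum_{(i,j) \in I \times J} \mathrm{Var}_\theta\big[t_{ij,w^*}(\mathbf{1}(\xi_{ij} = 1) + \mathbf{1}(\xi_{ij} \ne 1))\big] \le 1 + 4 \max_k w_k^* \cdot \frac{1}{\sqrt{MN}}\,E_\theta(t^{\chi^2}). \tag{5.2} \]

Recall that $\max_k w_k^* \to 0$ (see Proposition 3.1).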

Type I error probability of $\psi^{\chi^2}$. Since $\max_k w_k^* = o(1)$, the asymptotic standard
normality of $t^{\chi^2}$ under the null follows from Lemma 3.1 in [4]; then,

\[ P_0(t^{\chi^2} > H) = \Phi(-H) + o(1), \]

where $\Phi$ stands for the cdf of a standard Gaussian random variable.

Type II error probability of $\psi^{\chi^2}$ uniformly over $\Theta_{M,N}(\tau, r_\epsilon, m, n)$. We deduce
from (5.2) that $\mathrm{Var}_\theta(t^{\chi^2}) = 1 + o(E_\theta(t^{\chi^2}))$, uniformly over $\theta \in \Theta_{M,N}(\tau, r_\epsilon, m, n)$.
For all $\theta$ in $\Theta_{M,N}(\tau, r_\epsilon, m, n)$, by Markov's inequality and relation (5.1),

\[ P_\theta(t^{\chi^2} \le H) \le \frac{\mathrm{Var}_\theta(t^{\chi^2})}{(E_\theta(t^{\chi^2}) - H)^2} \le \frac{1 + 4\max_k w_k^*\,E_\theta(t^{\chi^2})/\sqrt{MN}}{(1-c)^2 E_\theta^2(t^{\chi^2})} \le \frac{1}{(1-c)^2 a^2(r_\epsilon)\,mnpq} + \frac{4\max_k w_k^*}{(1-c)^2\sqrt{MN}\,E_\theta(t^{\chi^2})} = o(1), \]

provided that $a(r_\epsilon)\sqrt{mnpq} \to +\infty$ and $H \le c \cdot a(r_\epsilon)\sqrt{mnpq}$ for some $0 < c < 1$.
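Type I error probability of $\psi^{scan}$.
Under the assumption (1.3), we can check that

\[ \log(C_M^m \cdot C_N^n) \sim m \cdot \log(p^{-1}) + n \cdot \log(q^{-1}). \]

Applying Markov's inequality,

\[ P_0(t^{scan} > K) \le \sum_{\xi \in T_{M,N}(m,n)} P_0\Big(\frac{1}{\sqrt{mn}} \sum_{(i,j) \in A_\xi \times B_\xi} t_{ij,w^*} > K\Big) = C_M^m C_N^n\, P_0\Big(\frac{1}{\sqrt{mn}} \sum_{(i,j) \in A_\xi \times B_\xi} t_{ij,w^*} > K\Big) \le C_M^m C_N^n\, e^{-K^2} \Big( E_0\Big(\exp\Big(\frac{K}{\sqrt{mn}}\,t_{11,w^*}\Big)\Big) \Big)^{mn}. \tag{5.3} \]

Set $\lambda = K/\sqrt{mn}$ with $K = \sqrt{2(1+\delta)\log(C_M^m C_N^n)}$, for some small $\delta > 0$, and note that
$(K/\sqrt{mn}) \max_k w_k^* \asymp K r_\epsilon^{1/(2\tau)}/\sqrt{mn} = o(1)$ by assumption in our theorem; then, applying
Lemma 5.1 we obtain that

\[ E_0(\exp(\lambda t_{11,w^*})) = \exp\Big(\frac{\lambda^2}{2}(1 + o(1))\Big). \]
Next, by plugging $(E_0(\exp(\frac{K}{\sqrt{mn}}\,t_{11,w^*})))^{mn} = \exp(\frac{K^2}{2}(1 + o(1)))$ into (5.3), we obtain

\[ P_0(t^{scan} > K) \le C_M^m C_N^n\, e^{-K^2/2\,(1 + o(1))} = o(1), \]

due to the choice of $K = \sqrt{2(1+\delta)\log(C_M^m C_N^n)}$, for some small $\delta > 0$.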

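Type II error probability of $\psi^{scan}$ uniformly over $\Theta_{M,N}(\tau, r_\epsilon, m, n)$. For any
$\theta \in \Theta_{M,N}(\tau, r_\epsilon, m, n)$, there exist $A \subset I$ and $B \subset J$ such that $|A| = m$, $|B| = n$ and
$\xi_{ij} = \mathbf{1}((i,j) \in A \times B)$; using the inequality $t^{scan} \ge \frac{1}{\sqrt{mn}} \sum_{(i,j) \in A \times B} t_{ij,w^*}$, we obtain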



^ {i,j)eAxB 



< 



Due to (3.4), we have 

(i,j)eAx_B (i,j)6Ax_B 

By assumption (3.8) we have lim inf a{rf)^mn/ K > (1 + 5)"^/^, which implies that 
K < a{r^)y/'mn{l + S)/{1 + 6) for some 6 > and then K < ca{rf:)^/mn < 
c]E}g{—^= j)&AxB^ij,w*) some < c < 1 if 5 is small enough. 
Now, proceeding as for Equation (5.2), we have

Ya,r-g{^L= tijuj*) = — Vare (%.^*) 

^ ii,j)eAxB {i,j)eAxB 

< 1 H ]Eg {tijw* 

(i,i)eAxB 



< 1 + 



4 maxfc vj^ 



I / J ^ij,i' 



imn y/mn 

{i,j)&AxB 



Finally, 



(1 — cj'^a'^yTejmn (1 — cya[r^)mn 



5.2. Proof of Theorem 4.1

The usual steps for proving the lower bounds are the following. First, we bound from below
the minimax total error probability by reducing the set of parameters. Next, we choose
a prior on the reduced set of parameters and bound the testing risk from below with a
Bayesian risk. Finally, this Bayesian risk is large if a $\chi^2$-distance between the likelihoods
under the null and under the mixture of alternatives is small.

Recall that $\{\theta_k^*\}_{k \in \mathbb{Z}}$ is the solution of the optimisation problem (3.1) and let us choose
a matrix $\xi$ in the set $T_{M,N}(m,n)$, $\xi = \mathbf{1}((i,j) \in A \times B)$, where $A = A_\xi$ is a set of $m$ rows
out of $M$ and $B = B_\xi$ a set of $n$ columns out of $N$. Denote by

rM,N{T,r^,m,n) = {6 = [6i{±6'^}fc](ij)e/x J, C G TM,Nim,n)}. 

This is the reduced set of parameters, i.e., a subset of the alternative $\Theta_{M,N}(\tau, r_\epsilon, m, n)$ in
our test.

A prior measure on the reduced set will choose $\xi$ with equal probability in the set
$T_{M,N}(m,n)$; given $\xi$, the $(\theta_{ij})$'s associated with non-zero $\xi_{ij}$ are i.i.d. and, for $(i,j)$ such that
$\xi_{ij} = 1$, the prior will choose with equal probability between $\theta_k^*$ and $-\theta_k^*$, independently
for each $k$. We can write $\pi_{ij,k} = \frac{1}{2}(\delta_{-\theta_k^*} + \delta_{\theta_k^*})$, where $\delta$ stands for the Dirac measure, and
$\pi_{ij} = \prod_k \pi_{ij,k}$. Let us define

^ = (jm(jn n 

the global prior on 0's in TM,N{T,r^,m,n). 

Let us write the likelihood ratio of one active component, i.e., when $(i,j)$ is such that
$\xi_{ij} = 1$:

\[ \frac{dP_{\pi_{ij}}}{dP_0}(\{x_{ij,k}\}_k) = \prod_k \exp\Big(-\frac{(\theta_k^*)^2}{2\epsilon^2\sigma_k^2}\Big) \cosh\Big(x_{ij,k}\,\frac{\theta_k^*}{\epsilon^2\sigma_k^2}\Big). \]

Set $X = [\{x_{ij,k}\}_k]_{(i,j)}$. Then the likelihood ratio with respect to the null hypothesis of
our observations becomes:

^^^^^ = ^(tK^i'^lc^,.)) = cic^ ^ S^^^' 

i&TM,N{m,n) 

where 

In order to prove indistinguishabihty, we see that 

7 = inf {w{'4))+_ sup ]Eg[l - i;(X)]) 

'AGIO,!] 06eA/,iv(r,r,,m,n) 

> inf {w{ii))+ sup ]Eg[l-^(X)]) 

> inf 7r(0)iE^[l-^(X)]) 

V'S[o,ij _ 

^67j\/,jv(r,rE,m,n) 

> inf {EQ{^{X)) + E^[{l-i:{X))L^{X)]). 

i/)e[o,i] 

This infimum is attained for the likelihood ratio test $\psi^*(X) = \mathbf{1}(L_\pi(X) > 1)$. By Fatou's
lemma, we have

\[ \liminf \gamma \ge E_0\big(\liminf\, \psi^*(X) + (1 - \psi^*(X))L_\pi(X)\big), \]

which implies that $\gamma \to 1$ if $L_\pi(X) \to 1$ in $P_0$-probability. In order to prove this sufficient
condition, it is enough to check that

\[ E_0(L_\pi(X)^2) \le 1 + o(1). \]

However, this last inequality cannot be obtained; indeed, too many events with small
probability are summed up in the expected value of the squared likelihood ratio. Therefore,
we slightly modify the likelihood ratio, by truncation, as follows:

*^ ^ C6TM,iv{m,n) (ijOeAjxBj k k k 

where the event is defined for some small (5i > as follows 



a(r,)Vhl ,. ..^ ^ ^ 

, ^ ' {i,])&AvxBv k 



logcosh(M/t • ^■''^ ) 



\/V G TM^N{h, I) such that Ay C A^, By C , 
and V/i, V/ such that 6im < h < m and 5in < I < n} 



+ ^ < Thi, (5.4) 
with $u_k = \theta_k^*/(\epsilon\sigma_k)$ and

\[ T_{hl}^2 = 2\big(\log(C_M^h C_N^l)(1 + \Delta_\epsilon) + \log(mn)\big), \]

for small $\Delta_\epsilon = o(1)$. Moreover, in order to finish the proof of the lower bounds,
Lemma 5.2, 1. will require $\Delta_\epsilon$ to be such that

A^ ■ \og{C%&jq) —?■ oo and AJ{a{r^) max wl) —?■ oo. (5.5) 

k 

We suggest to take, e.g. the largest value between Ai^^ = 1/ log((a(re) maxfc wD^^) and 
A2,, = l/log(C7]f,Cjv). 

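Let us see that, as $r_\epsilon \le \tilde r_\epsilon$,

\[ a(r_\epsilon) \cdot \max_k w_k^* \le O(1)\,a(\tilde r_\epsilon)\,\tilde r_\epsilon^{\frac{1}{2\tau}} \le O(1)\,a(\tilde r_\epsilon)^{1 + \frac{1}{4\tau + 4s + 1}}\,\epsilon^{\frac{2}{4\tau + 4s + 1}} \le O(1)\big(a(\tilde r_\epsilon) \cdot \epsilon^{\frac{1}{2\tau + 2s + 1}}\big)^{\frac{4\tau + 4s + 2}{4\tau + 4s + 1}} = o(1), \]

by assumption (4.3). This also implies that $\Delta_{1,\epsilon} = o(1)$ and that $\Delta_\epsilon = o(1)$.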

In the following we shall denote by $V \subset \xi$ any matrix of size $M \times N$ such that $V =
\mathbf{1}((i,j) \in A_V \times B_V)$, with $A_V \subset A_\xi$ and $B_V \subset B_\xi$. Note that $V$ may have value 1 only in
a submatrix where $\xi$ has value 1.

The idea is that the random variable in this event is truncated at the values predicted by
large deviations and this is sufficient to diminish the second-order moment of the likelihood
ratio.

Let us denote T = Ci^^Tm jv(m,n)-'^S- Then, for some fixed 6 > 0, let us consider the event 
£ = {\L,,(X) - 1\ > 6} 

iPo{£) = iPo(^nr) + iPo(£:nr^) 

< 6-\lEo{Ljr(xf) - 1) - 2{Eo{L^(X)) - 1) + (iPo(r) - 1)] + iPo(r^), 

where $\Gamma^c$ denotes the complement of $\Gamma$. Then, it remains to prove the following lemma to
complete the proof of Theorem 4.1.

Lemma 5.2 Under the assumptions of Theorem 4-1 we have the following: 

1. PoiT) ^ 1. 

2. E^{U{X)) ^ 1. 

3. E^{L^{Xf) < l + o(l). 

The proof of Lemma 5.2 is postponed to Section 6.2.
6. Appendix

6.1. Auxiliary results for the lower bounds 

Lemma 6.1 As $N \to +\infty$, $n \to +\infty$ and $n/N \to 0$, if $q = \frac{2n}{N - n}$ then

\[ P_{HG(N,n,n)}(X \ge k) \le P_{B(n,q)}(Y \ge k), \]

for all integer numbers $k$, where $HG(N,n,n)$ and $B(n,q)$ denote the hypergeometric
distribution and the Binomial distribution, respectively.

Proof of Lemma 6.1 The proof is based on Theorem 1 (d) in [5] that states that for
positive integers $M$, $A$, $m$, $n$ such that $m \le M$ and for $q \in (0, 1]$,

\[ P_{HG(M,A,m)}(X \ge k) \le P_{B(n,q)}(Y \ge k) \]

for all integers $k$ if and only if $\min(m, A) \le n$ and $C_{M-A}^m / C_M^m \ge (1 - q)^n$.

In our case, it is sufficient to check that asymptotically $C_{N-n}^n / C_N^n \ge (1 - q)^n$. As
$N \to \infty$, $n \to \infty$ and $n/N \to 0$, using Stirling's formula and after simplifications, one
gets,

^ {{N-n)\f 
C% N\{N-2n)\ 

~ exp {{2{N -n) + l) \og{N -n) - {N +]^) log(iV) - [N -2n+'^) \og{N - 2n) 

( n \ 2n 

~ exp [{2{N - n) + 1) log(l - _) - (iV - 2n + -) log(l - — ; 

2 2 

n ,n 



exp(-2- + o(-)). 



2n 



which is larger than (1 — q)^ provided that g > this is satisfied with q = — . □ 

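Lemma 6.2 If $r_\epsilon \to 0$ such that $a(r_\epsilon) \cdot \max_k w_k^* = o(1)$, then for any $\lambda > 0$ such that
$\lambda = O(1)$,

\[ E_0\Big(\exp\Big(\lambda \sum_k Z_{11,k}\Big)\Big) = \exp\Big(\frac{\lambda^2 a^2(r_\epsilon)}{2}\big(1 + O(a(r_\epsilon) \max_k w_k^*)\big)\Big). \]

Proof of Lemma 6.2 Let us see that for bounded $\lambda > 0$, for $u \to 0$ and a standard Gaussian
random variable $\eta$, we have:

\[ E\Big(\exp\Big(\lambda \cdot \Big(\log\cosh(u \cdot \eta) - \frac{u^2}{2} + \frac{u^4}{4} + O(u^6)\Big)\Big)\Big) = \exp\Big(\frac{\lambda^2 u^4}{4} + O(u^6)\Big). \]

This proof can be adapted from [3] (cf. Lemma A.1). Now, we apply this result for each $k$ and
note that $u_k^2 = 2a(r_\epsilon) \cdot w_k^* \le 2a(r_\epsilon) \cdot \max_k w_k^* = o(1)$ by assumption. Using $\sum_k u_k^4 = 2a^2(r_\epsilon)$,
we get

\[ E_0\Big(\exp\Big(\lambda \sum_k Z_{11,k}\Big)\Big) = \exp\Big(\frac{\lambda^2}{4} \sum_k u_k^4\,(1 + O(u_k^2))\Big) = \exp\Big(\frac{\lambda^2}{2}\,a^2(r_\epsilon)\big(1 + O(a(r_\epsilon) \cdot \max_k w_k^*)\big)\Big). \]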

Z k 



□ 



6.2. Proof of Lemma 5.2 

Take a small $\delta > 0$. The detection boundary $a(r_\epsilon)$ satisfies (4.5), so the most difficult case
is when the limit is close to 1. Therefore, we shall assume that

\[ a^2(r_\epsilon)\,mn \sim (2 - \delta)(m \cdot \log(p^{-1}) + n \cdot \log(q^{-1})). \]

This implies

\[ a^2(r_\epsilon) \asymp \frac{\log(p^{-1})}{n} + \frac{\log(q^{-1})}{m}. \tag{6.1} \]

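Let us see that the random variable in (5.4) is

\[ Y_V = \frac{1}{a(r_\epsilon)\sqrt{hl}} \sum_{(i,j) \in A_V \times B_V} \sum_k Z_{ij,k}, \quad \text{with } Z_{ij,k} = \log\cosh\Big(u_k \cdot \frac{x_{ij,k}}{\epsilon\sigma_k}\Big) - \frac{u_k^2}{2} + \frac{u_k^4}{4}, \tag{6.2} \]

where $u_k = \theta_k^*/(\epsilon\sigma_k)$ and $\{\theta_k^*\}_k$ is the optimal sequence defined in Proposition 3.1. Recall
that $\theta_k^*$ is null for $k > T$ and thus the sum over $k$ contains a finite number of non-null
elements. Moreover, due to (3.3), recall that we have

\[ \sum_k u_k^4 = 2a^2(r_\epsilon). \tag{6.3} \]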

1. We shall prove that IPq(T'-'') — )• 0. Let us write more conveniently 



= U U \J{yv>TM} 

6in < I < n 

= U U {yv>TM}. 

dim < n < m, 
5in < I < n 
Therefore, we have



iPo(r^) < E E jPo{yvTM>Ti{) 

dim < n < m, 
Sin < I < n 

dim < h < m, 
5in < I < n 



< 



hi 



6im < h < m 
5in < I < n 

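Using Equation (6.3) and applying Lemma 6.2 for $\lambda = T_{hl}/(a(r_\varepsilon)\sqrt{hl})$, which is $O(1)$, one obtains

$$\mathbb{E}_0\Big(\exp\Big(\frac{T_{hl}}{a(r_\varepsilon)\sqrt{hl}} \sum_k Z_{\eta,k}\Big)\Big) = \exp\Big(\frac{T_{hl}^2}{2hl}\big(1 + O(a(r_\varepsilon) \max_k w_k^*)\big)\Big).$$

Recall that $T_{hl}^2 = 2\log(C_M^h C_N^l)(1 + \Delta) + 2\log(mn)$ where $\Delta = o(1)$ by (5.5). Therefore

$$\mathbb{P}_0(\Gamma^c) \leq \sum_{\substack{\delta_1 m \leq h \leq m, \\ \delta_1 n \leq l \leq n}} C_M^h C_N^l\, e^{-T_{hl}^2} \Big(\exp\Big(\frac{T_{hl}^2}{2hl}\big(1 + O(a(r_\varepsilon)\max_k w_k^*)\big)\Big)\Big)^{hl}$$

$$\leq \sum_{\substack{\delta_1 m \leq h \leq m, \\ \delta_1 n \leq l \leq n}} \exp\Big(-\frac{T_{hl}^2}{2} + \frac{T_{hl}^2}{2}\, O(a(r_\varepsilon) \max_k w_k^*) + \log(C_M^h C_N^l)\Big)$$

$$\leq \sum_{\substack{\delta_1 m \leq h \leq m, \\ \delta_1 n \leq l \leq n}} \exp\Big(-\log(C_M^h C_N^l)\big(\Delta - c\, a(r_\varepsilon) \max_k w_k^*\big) - \log(mn)\Big) = o(1),$$

for large enough $m$, $n$, $M$ and $N$, as we have both $\Delta \cdot \log(C_M^h C_N^l) \to \infty$ and $a(r_\varepsilon) \max_k w_k^* = o(\Delta)$, by (5.5).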



2. We have 



Eo{U{x)) = m-^^ E ^W]i(r«)) = ^«(r?), 
which tends to 1 if and only if $\mathbb{P}_\pi(\Gamma^c) \to 0$. As we can write

dim < h < m, 
5in < I < n 

where IPy is such that 

= n U^w{-4)^^Muk^)=eMYva{r.)Vhl-lh^). 
dlrn z ecu I 

{i,j)&AvxBv k 

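Then, applying Lemma 6.2, one obtains for any positive $\lambda$ such that $(\lambda + 1) = O(1)$,

$$\mathbb{P}_V(Y_V > T_{hl}) = \mathbb{P}_V\big(Y_V\, a(r_\varepsilon)\sqrt{hl} > T_{hl}\, a(r_\varepsilon)\sqrt{hl}\big)$$

$$\leq \mathbb{E}_V\big[\exp(\lambda Y_V\, a(r_\varepsilon)\sqrt{hl})\big] \exp\big(-\lambda T_{hl}\, a(r_\varepsilon)\sqrt{hl}\big)$$

$$= \mathbb{E}_0\big[\exp((\lambda + 1) Y_V\, a(r_\varepsilon)\sqrt{hl})\big] \exp\Big(-hl\,\frac{a^2(r_\varepsilon)}{2} - \lambda T_{hl}\, a(r_\varepsilon)\sqrt{hl}\Big)$$

$$= \exp\Big((\lambda + 1)^2\, hl\, \frac{a^2(r_\varepsilon)}{2}(1 + o(1)) - hl\,\frac{a^2(r_\varepsilon)}{2} - \lambda T_{hl}\, a(r_\varepsilon)\sqrt{hl}\Big). \qquad (6.4)$$

The minimum value for the right-hand side of (6.4) is

$$\exp\Big(-\frac{(T_{hl} - a(r_\varepsilon)\sqrt{hl})^2}{2}(1 + o(1))\Big),$$

which is achieved for $\lambda = \dfrac{T_{hl}}{a(r_\varepsilon)\sqrt{hl}} - 1$. Note that $\lambda$ satisfies $(\lambda + 1) = O(1)$ and that $T_{hl}/(a(r_\varepsilon)\sqrt{hl})$ is asymptotically bounded from below by a constant larger than 1.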
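The minimization of the right-hand side of (6.4) is a one-line computation that can be verified mechanically. The sketch below drops the $(1+o(1))$ factor and writes `A` for $a(r_\varepsilon)\sqrt{hl}$; the function names and numerical values are illustrative assumptions.

```python
def chernoff_exponent(lam: float, T: float, A: float) -> float:
    """Exponent of (6.4) without the (1 + o(1)) factor, with A = a(r_eps) * sqrt(h * l)."""
    return (lam + 1) ** 2 * A ** 2 / 2 - A ** 2 / 2 - lam * T * A

def optimal_lambda(T: float, A: float) -> float:
    """Minimizer of the exponent in lam; it is positive as soon as T > A."""
    return T / A - 1.0

if __name__ == "__main__":
    T, A = 3.0, 2.0  # hypothetical values with T > A
    lam_star = optimal_lambda(T, A)
    # The minimal exponent equals -(T - A)^2 / 2.
    print(chernoff_exponent(lam_star, T, A), -(T - A) ** 2 / 2)
```

The exponent is a convex quadratic in $\lambda$, so the stationary point $\lambda + 1 = T/A$ is the global minimizer.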

In conclusion, 

Pd^f) < E Ctci,eM-l{Tki-a{r,)^l)\l + o{m 

6im < h < m, 
6in < I < n 

< E CMexp(-^r2(l + o(l))), 

6im < h < m, 
6in < I < n 

for some $c(\delta) > 0$ small with $\delta$. Therefore, due to the asymptotic value of $T_{hl}$, $\mathbb{P}_\pi(\Gamma^c) = o(1)$.
3. We have, for $\xi_1 = \mathbb{1}((i,j) \in A_1 \times B_1)$ and $\xi_2 = \mathbb{1}((i,j) \in A_2 \times B_2)$,

lEoiLUX)) = 77.4^ where 



0^- 



9ih,i) = n n u^M 



(6* 9* \ 

n n^^P(-4S)^o(cosh2(x,,-fc^)]i(r5,nr5,)] 

and the function $g$ depends on the sets $A_1$, $A_2$ and $B_1$, $B_2$ only through the number $h$ of common rows of $A_1$ and $A_2$ and the number $l$ of common columns of $B_1$ and $B_2$. After some combinatorics we can write



lEo{L±{X)) = IE{g{H,L)), 

where $H$ and $L$ are independent random variables having hypergeometric distribution $\mathcal{HG}(M, m, m)$ and $\mathcal{HG}(N, n, n)$, respectively. Let us see that, for any $0 \leq h \leq m$ and $0 \leq l \leq n$,

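$$\log(g(h,l)) \leq \log\Big(\prod_{(i,j) \in (A_1 \cap A_2) \times (B_1 \cap B_2)} \prod_k \exp(-u_k^2)\, \mathbb{E}_0\Big(\cosh^2\Big(\frac{u_k\, x_{ij,k}}{\varepsilon \sigma_k}\Big)\Big)\Big)$$

$$= hl \log\Big(\prod_k \exp(-u_k^2) \cdot \frac{1}{2}\big(\exp(2u_k^2) + 1\big)\Big)$$

$$= hl \log\Big(\prod_k \cosh(u_k^2)\Big) =: hl \cdot D.$$

Therefore, $\mathbb{E}(g(H, L)) \leq \mathbb{E}(e^{HLD})$ for $D$ which has the following asymptotic equivalent

$$D = \sum_k \log\cosh(u_k^2) = \sum_k \frac{u_k^4}{2}(1 + o(1)) = a^2(r_\varepsilon)(1 + o(1)), \qquad (6.5)$$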
which holds under Assumption (4.3). Indeed, this Assumption implies that 

m&Xwla{n) < ^ < C6"V^(2s+l)/r < ^^_2^~2+(2s+l)/r ^ 
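The two facts behind (6.5) are the Gaussian identity $e^{-u^2}\,\mathbb{E}\cosh^2(u\eta) = \cosh(u^2)$ and the expansion $\log\cosh(x) \sim x^2/2$ as $x \to 0$. A minimal numerical check follows; the sequence `us` is a hypothetical stand-in for $\{u_k\}_k$, not the optimal sequence of Proposition 3.1.

```python
import math

def log_second_moment_factor(u: float) -> float:
    """log of exp(-u^2) * E cosh^2(u * eta) for eta ~ N(0, 1), using
    cosh^2(x) = (cosh(2x) + 1) / 2 and E cosh(2 * u * eta) = exp(2 * u^2)."""
    return -u * u + math.log((math.exp(2 * u * u) + 1) / 2)

def D_exact(us) -> float:
    """D = sum_k log cosh(u_k^2)."""
    return sum(math.log(math.cosh(u * u)) for u in us)

def D_leading(us) -> float:
    """Leading term sum_k u_k^4 / 2, equal to a^2(r_eps) by (6.3)."""
    return sum(u ** 4 / 2 for u in us)

if __name__ == "__main__":
    us = [0.1] * 50  # hypothetical small coefficients
    print(D_exact(us), D_leading(us))
```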

We shall split $\mathbb{E}(g(H, L))$ into the sum $I_1 + I_2$, where

$$I_1 = \mathbb{E}\big(g(H,L) \cdot \mathbb{1}(HD \leq 1)\big),$$

$$I_2 = \mathbb{E}\big(g(H,L) \cdot \mathbb{1}(HD > 1)\big).$$
For $I_1$, we use the stochastic ordering in Lemma 6.1 and replace the hypergeometric distributions of $H$ and $L$ with binomial distributions of $H \sim \mathcal{B}in(m,p)$ and $L \sim \mathcal{B}in(n,q)$, respectively. We have

$$I_1 \leq \mathbb{E}\big(\mathbb{E}(e^{HLD}\, \mathbb{1}(HD \leq 1) \mid H)\big)$$

$$\leq \mathbb{E}\big((1 + q(e^{HD} - 1))^n\, \mathbb{1}(HD \leq 1)\big)$$

$$\leq \mathbb{E}(\exp(CnqHD)) = \big(1 + p(e^{CnqD} - 1)\big)^m,$$

for some constant $C > 0$. By assumption (4.2), (6.1) and (6.5), $Dn \asymp \log(p^{-1})$, which implies that $Dnq \asymp q\log(p^{-1})$ and this is an $o(1)$ by assumption (4.1). So, by assumption (4.4), $I_1 \leq \exp(CmnpqD) = 1 + o(1)$.
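The chain of inequalities for $I_1$ rests on the binomial moment generating function together with $e^x - 1 \leq (e-1)x$ on $[0,1]$ and $(1+y)^n \leq e^{ny}$. The sketch below evaluates both ends exactly for $H \sim \mathcal{B}in(m,p)$. The choice $C = e - 1$ and all numerical values are our assumptions for illustration; the paper only states "for some constant $C > 0$".

```python
import math

def truncated_moment(m: int, p: float, n: int, q: float, D: float) -> float:
    """E[(1 + q(e^{HD} - 1))^n 1(HD <= 1)] for H ~ Bin(m, p), by exact summation."""
    total = 0.0
    for h in range(m + 1):
        if h * D <= 1.0:
            prob = math.comb(m, h) * p ** h * (1 - p) ** (m - h)
            total += prob * (1 + q * (math.exp(h * D) - 1)) ** n
    return total

def binomial_mgf_bound(m: int, p: float, n: int, q: float, D: float,
                       C: float = math.e - 1) -> float:
    """(1 + p(e^{C n q D} - 1))^m, the moment generating function of Bin(m, p)
    evaluated at C * n * q * D."""
    return (1 + p * (math.exp(C * n * q * D) - 1)) ** m

if __name__ == "__main__":
    args = (20, 0.1, 30, 0.05, 0.2)  # hypothetical (m, p, n, q, D)
    print(truncated_moment(*args), binomial_mgf_bound(*args))
```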

The rest of the Section is devoted to the proof that $I_2 = o(1)$. We shall further split the expected value into the sum of $I_{21} + I_{22}$, where

$$I_{21} = \mathbb{E}\big(g(H, L) \cdot \mathbb{1}(HD > 1) \cdot \mathbb{1}(L \leq n\delta_1)\big),$$

$$I_{22} = \mathbb{E}\big(g(H, L) \cdot \mathbb{1}(HD > 1) \cdot \mathbb{1}(L > n\delta_1)\big),$$

for some fixed $\delta_1 > 0$, small enough such that $Dn\delta_1 \leq \log(p^{-1})/2$ and that $Dm\delta_1 \leq \log(q^{-1})/2$. On the one hand

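$$I_{21} \leq \sum_{D^{-1} < h \leq m,\ 0 \leq l \leq n\delta_1} e^{hlD}\, \mathbb{P}_{\mathcal{HG}(M,m,m)}(H = h)\, \mathbb{P}_{\mathcal{HG}(N,n,n)}(L = l)$$

$$\leq \sum_{D^{-1} < h \leq m,\ 0 \leq l \leq n\delta_1} e^{h(lD - \log(p^{-1})(1 + o(1)))},$$

as $\mathbb{P}_{\mathcal{HG}(N,n,n)}(L = l) \leq 1$ and by using Lemma 5.3 in [1] for $\log(\mathbb{P}_{\mathcal{HG}(M,m,m)}(H = h)) \leq h\log(p)(1 + o(1))$. Now, under the constraints in the sum, $lD \leq Dn\delta_1 \leq (1/2 + o(1))\log(p^{-1})$. This implies that $h(lD - \log(p^{-1})(1 + o(1))) \leq -h\log(p^{-1})(1/2 + o(1)) \leq -D^{-1}\log(p^{-1})(1/2 + o(1)) \asymp -n$. Therefore,

$$I_{21} \leq mn\, e^{-Bn}, \quad \text{for some fixed } B > 0,$$

and this is an $o(1)$.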

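We can also split $I_{22}$ into the sum of $I_{221} + I_{222}$, where

$$I_{221} = \mathbb{E}\big(g(H, L) \cdot \mathbb{1}(HD > 1,\ H \leq m\delta_1) \cdot \mathbb{1}(L > n\delta_1)\big),$$

$$I_{222} = \mathbb{E}\big(g(H, L) \cdot \mathbb{1}(HD > 1,\ H > m\delta_1) \cdot \mathbb{1}(L > n\delta_1)\big).$$

It is easy to check that $I_{221} = o(1)$ as we previously did for $I_{21}$.
On the other hand, we can write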

I222 = lE{e"^^-R{n)), where Ti = {{h,l), m5i < h < m,n6i <l<n}. 
Note that under the event $\mathcal{H}$ we have $T_{hl}^2 = (2 + \delta)\big(h/m \cdot m\log(p^{-1}) + l/n \cdot n\log(q^{-1})\big) \geq$

We divide again the set T-L in disjoint sets 

•Hi = {{h, l)e-H: Tl, > 2Tl^^} and = {{K /) G ^ : < 2Ti,^}. 
Let us go back to L^{X) and write it as 



1 / 6 



Uk 



exp(-a^(rJmn/2) ^ / v^/, , /2;i,-,fc , ul ut 

= E -p E Eaog-«M^-^)-f + f 

^ ^ ?eTM.iv{m,n) \{i,i)GA^xB^ k 

where $u_k = \theta_k^*/(\varepsilon \sigma_k)$ is such that $\sum_k u_k^4 = 2a^2(r_\varepsilon)$ (see (3.3)).

Now, we give a tighter upper bound for g{h, I) than the one used for Ii. Using the same 
notation as to define Yy in (6.2) and for any matrix ^, we define the random variable 
^« = ^dvX! ^{^,)^A,xB, Ek Then, we write 

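$$g(h,l) = e^{-a^2(r_\varepsilon) mn}\, \mathbb{E}_0\big(e^{a(r_\varepsilon)\sqrt{mn}\,(Y_{\xi_1} + Y_{\xi_2})}\, \mathbb{1}(\Gamma_{\xi_1} \cap \Gamma_{\xi_2})\big)$$

$$\leq e^{-a^2(r_\varepsilon) mn + 2T_{mn}J}\, \mathbb{E}_0\big(e^{(a(r_\varepsilon)\sqrt{mn} - J)(Y_{\xi_1} + Y_{\xi_2})}\big), \qquad (6.6)$$

for some $J > 0$ that we will choose later on. In order to deal with $I_{222}$, we keep in mind that we consider only submatrices $\xi_1$ and $\xi_2$ having $h$ common rows and $l$ common columns, such that $(h,l) \in \mathcal{H}$. Denote by $V$ the submatrix of common rows and columns for $\xi_1$ and $\xi_2$, that is

$$V = \mathbb{1}\big((i,j) \in (A_{\xi_1} \times B_{\xi_1}) \cap (A_{\xi_2} \times B_{\xi_2})\big),$$

and by $V_1 = \xi_1 - V$ (respectively $V_2 = \xi_2 - V$). Therefore,

$$\sqrt{mn}\,(Y_{\xi_1} + Y_{\xi_2}) = \sqrt{mn - hl}\,(Y_{V_1} + Y_{V_2}) + 2\sqrt{hl}\, Y_V.$$

Replace this into the equation (6.6) and get by Lemma 6.2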

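$$I_{222} \leq \sum_{(h,l) \in \mathcal{H}} \exp\Big(-a^2(r_\varepsilon) mn + 2T_{mn}J + (a(r_\varepsilon)\sqrt{mn} - J)^2\Big(1 + \frac{hl}{mn}\Big)\Big)\, \mathbb{P}_{\mathcal{HG}(M,m,m)}(h)\, \mathbb{P}_{\mathcal{HG}(N,n,n)}(l)$$

$$\leq \sum_{(h,l) \in \mathcal{H}} \exp\Big(-a^2(r_\varepsilon) mn + 2T_{mn}\, a(r_\varepsilon)\sqrt{mn} - \frac{T_{mn}^2}{1 + hl/(mn)} - \big(h\log(p^{-1}) + l\log(q^{-1})\big)(1 + o(1))\Big)$$

$$\leq \sum_{(h,l) \in \mathcal{H}} \exp\Big(-(a(r_\varepsilon)\sqrt{mn} - T_{mn})^2 + \frac{hl}{mn + hl}\, T_{mn}^2 - \big(h\log(p^{-1}) + l\log(q^{-1})\big)(1 + o(1))\Big),$$

for $a(r_\varepsilon)\sqrt{mn} - J = T_{mn}/(1 + hl/(mn))$. Note that, for $\delta > 0$ small enough, there exists $\delta_2 > 0$ such that $(a(r_\varepsilon)\sqrt{mn} - T_{mn})^2 \geq \delta_2 T_{mn}^2$. Moreover, for $(h,l) \in \mathcal{H}_1$,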



(Mog(p-i) + /log(<7-i))(l + o(l)) < llJ}L .TM^±<}^ 



mn + hi mn + hi 2 

— -^mn (^I j J T // \ ~ ~l~ ^^m,n)i 

mn 1 + hl/[mn) 

which is asymptotically $o(T_{mn}^2)$. This implies that $I_{222} = o(1)$ over the set $\mathcal{H}_1$.

Finally, we give a yet slightly different upper bound for $g(h,l)$ in order to deal with $I_{222}$ when $(h,l)$ belongs to $\mathcal{H}_2$.

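$$g(h,l) \leq e^{-a^2(r_\varepsilon) mn}\, \mathbb{E}_0\big(e^{a(r_\varepsilon)\sqrt{mn}\,(Y_{\xi_1} + Y_{\xi_2})}\, \mathbb{1}(Y_V \leq T_{hl})\big)$$

$$\leq e^{-a^2(r_\varepsilon) hl + T_{hl}J}\, \mathbb{E}_0\big(e^{(2a(r_\varepsilon)\sqrt{hl} - J)Y_V + J(Y_V - T_{hl})}\, \mathbb{1}(Y_V \leq T_{hl})\big)$$

$$\leq e^{-a^2(r_\varepsilon) hl + T_{hl}J + (2a(r_\varepsilon)\sqrt{hl} - J)^2/2}.$$

Take $J = 2a(r_\varepsilon)\sqrt{hl} - T_{hl}$, which is indeed positive for $(h,l)$ in $\mathcal{H}_2$, and obtain

$$g(h,l) \leq \exp\Big(-a^2(r_\varepsilon) hl + 2a(r_\varepsilon)\sqrt{hl}\, T_{hl} - \frac{T_{hl}^2}{2}\Big).$$

Moreover, denote $D_{hl} = h\log(p^{-1}) + l\log(q^{-1})$ and see that $D_{mn}\, hl/(mn) \leq D_{hl}$. We get

$$I_{222} \leq \sum_{(h,l) \in \mathcal{H}_2} \exp\Big(-a^2(r_\varepsilon) hl + 2a(r_\varepsilon)\sqrt{hl}\, T_{hl} - \frac{T_{hl}^2}{2}\Big)\, \mathbb{P}_{\mathcal{HG}(M,m,m)}(h)\, \mathbb{P}_{\mathcal{HG}(N,n,n)}(l)$$

$$\leq \sum_{(h,l) \in \mathcal{H}_2} \exp\Big(-(a(r_\varepsilon)\sqrt{hl} - T_{hl})^2 + \frac{T_{hl}^2}{2} - D_{hl}(1 + o(1))\Big)$$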

< Y eM-jDl + o{l)Dl) = o{l). 

ih,l)eH 

□ 



6.3. Proof of Lemma 5.1

For the sake of simplicity, we omit in this part the indices $i$ and $j$ so that $t_{ij}^*$ and $\eta_{ij,k}$ are denoted by $t^*$ and $\eta_k$, respectively.

Under $H_0$, observe that $t^* = \sum_k w_k^*(\eta_k^2 - 1)$, with $\eta_k$ i.i.d. $\mathcal{N}(0,1)$. Using the fact that $\mathbb{E}(e^{t\eta_k^2}) = (1 - 2t)^{-1/2}$ for $t < \frac{1}{2}$, we obtain for $\lambda$ such that $\lambda \max_k w_k^* = o(1)$,

$$\mathbb{E}_0(\exp(\lambda t^*)) = \prod_k \exp\Big(-\lambda w_k^* - \frac{1}{2}\log(1 - 2\lambda w_k^*)\Big)$$

$$= \exp\Big(\lambda^2 \sum_k (w_k^*)^2 (1 + o(1))\Big)$$

$$= \exp\Big(\frac{\lambda^2}{2}(1 + o(1))\Big), \qquad (6.7)$$

where the last equality holds since $\sum_k (w_k^*)^2 = \frac{1}{2}$. This ends the proof. □
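The expansion leading to (6.7) can be illustrated numerically. The sketch below evaluates the exact log-moment generating function of a weighted sum of centred chi-square variables and compares it with $\lambda^2/2$. The flat weights $w_k = 1/\sqrt{2K}$ are a hypothetical choice satisfying $\sum_k w_k^2 = 1/2$; they are not the optimal weights $w_k^*$ of the paper.

```python
import math

def log_mgf_statistic(lam: float, weights) -> float:
    """log E_0 exp(lam * t) for t = sum_k w_k (eta_k^2 - 1), eta_k i.i.d. N(0, 1).
    Requires 2 * lam * w_k < 1 for every k."""
    return sum(-lam * w - 0.5 * math.log(1 - 2 * lam * w) for w in weights)

def flat_weights(K: int):
    """Illustrative weights w_k = 1 / sqrt(2K), so that sum_k w_k^2 = 1/2."""
    return [1 / math.sqrt(2 * K)] * K

if __name__ == "__main__":
    lam = 1.0
    for K in (100, 10_000, 1_000_000):
        # As max_k w_k decreases, the log-moment approaches lam^2 / 2.
        print(K, log_mgf_statistic(lam, flat_weights(K)), lam ** 2 / 2)
```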



6.4. Proof of Proposition 3.1

These computations can be found in Ingster and Suslina [4], but we give a sketch of the proof for the convenience of the reader.

Let us change variables in problem (3.1), by defining = for all A; € Z. We have 

{^fe}fe belongs to I1(t, r^) if and only if {vk}k belongs to S(r, r^), where 

S(r,r,) = I {vk]k G im : t^fc > 0; (27r)2- | ^I^V^t;, < -L; ^^,^2 > ^ 1 
The problem (3.1) is equivalent to 

^ sup inf ^-WfcWfc 

^ {«'fc}fc:Efe«'i=l/2,«'fe>0 K}fc6S(r,r,) ^ 

= _ sup ^ ^ inf ^w^fci^fc 



e 



{"'fe}fc:Efc«'i<l/2,«'fe>0 K}fc6S(r,r,) ^ 



= ^ inf sup y^^WkVk, 

e K}fcGE(T,rO {t«fc}fc:Efc"'I<l/2,«;fe>0 ^ 

by the minimax theorem on convex sets. Now, use the Cauchy-Schwarz inequality to see that

$$\sqrt{2} \sup_{\{w_k\}_k :\ \sum_k w_k^2 \leq 1/2,\ w_k \geq 0}\ \sum_k w_k v_k = \Big(\sum_k v_k^2\Big)^{1/2}$$

and the equality holds for Wk = f^fc(2 f|)~^^^. As we denoted by = X]fc(t'fc)^ we get 
k = </ 



wt = v\/ {^/2Ve), which is equivalent to 



wt = for ah keZ. 
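The Cauchy-Schwarz step can be checked on a small example: for a nonnegative sequence $v$, the weights $w_k = v_k (2\sum_k v_k^2)^{-1/2}$ satisfy $\sum_k w_k^2 = 1/2$ and attain $\sqrt{2}\sum_k w_k v_k = (\sum_k v_k^2)^{1/2}$. The vector `v` and the function names below are illustrative assumptions.

```python
import math

def optimal_weights(v):
    """Weights w_k = v_k * (2 * sum_k v_k^2)^(-1/2), attaining equality in Cauchy-Schwarz."""
    norm = math.sqrt(2 * sum(x * x for x in v))
    return [x / norm for x in v]

def objective(w, v) -> float:
    """sqrt(2) * sum_k w_k v_k, the quantity maximized over sum_k w_k^2 <= 1/2."""
    return math.sqrt(2) * sum(a * b for a, b in zip(w, v))

if __name__ == "__main__":
    v = [3.0, 4.0]  # hypothetical nonnegative sequence with sum of squares 25
    w = optimal_weights(v)
    print(sum(x * x for x in w), objective(w, v))
    # Any other feasible choice, e.g. w = (0.5, 0.5), gives a smaller value.
    print(objective([0.5, 0.5], v))
```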



It follows that solving the problem (3.1) reduces to solve the optimization program 

K.2 \ 






By the Lagrange multiplier rule, one gets for $\lambda_1 \in \mathbb{R}$ and $\lambda_2 \in \mathbb{R}$ the following system of equations

Put, for all k eZ,Vk = val {l - , where v = T = ^ (if)'^^'"^ and 

(x)+ = max(0, x). 

We evaluate the solution of the previous system as $T$ goes to infinity. Using $\sigma_k \sim |k|^s$ for $|k|$ large enough and some $s > 0$, the last two equations in the previous system become

that gives 

/kA^ -i 1 (k2\^ 2+1^ 
i \ — and v ~ Te . 

Note that $T \to \infty$ provided that $r_\varepsilon \to 0$. It further gives



\k\<T 



^ 4s+l 



Finally, it is straightforward that 

maxiM,, < — max 
fc K o<|fc|<r 



V n 4S+2T + 1 4a+4T+l 2a J_ ^ 



□ 
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